Welcome to the Bluetooth Device Database. Ryan Holtman, take it away.
Thank you. Thanks for coming. Last talk of the day. Everyone having fun so far?
So this project is a little more high level than some of my previous projects. If you're
getting into Bluetooth, it's really fun to see it at this type of view. And I don't believe
that it's really ever been looked at in this way. And I believe that you really need to
look at devices in the wild in order to understand the technology. So basically, in short, this
project was to beat the largest scale Bluetooth device survey that I've seen out there. And
with it, I gave a lot of support for the community to participate. So I have a lot of clients
and I'll tell you how to install them later on. But if you want, go out and install some
of this stuff.
And send me your data. It's for everyone, not just for me. So a little bit about myself
before I get started. It's a great picture of me. I'm a senior server ‑‑ or I used
to be a server developer. I'm a security researcher at Ziffen Technologies where we
do a lot of end point security analytics and management solutions. If you haven't heard
about us, check us out. We do some cool stuff. In my spare time, I spend a lot of time with
Bluetooth. I have some Python Ubertooth projects that I have up on my GitHub repo. Fun stuff.
I have fun doing this stuff in my spare time. Follow me on Twitter to follow any other projects
I'm doing. And all this research today is kind of just of my own. It has nothing to
do with Ziffen. They just allow me to come out and present to you guys. So as I mentioned
before, at its core, the Bluetooth device is a little bit different. It's a little bit
different. So I'm going to show you around a little bit. I'm using it for the second
time today. The first time I was doing it, I came from the
Bluetooth database project. It's a one‑touch box. For the first time I was using it. I
didn't really know anything about that. So first time I was using it, I was just a little
with a lot of answers for them and learn some new things. And so all of this project was
done only on discoverable devices and a lot of people say, hey, why did you only do discoverable
devices? You do a lot of passive stuff in your spare time.
Well, the number one reason was convenience, right? I could write a simple iPhone app and
put it in my pocket and walk around and scan things. People don't really like it when you
walk around with big antennas and crowds and computers. So that was the main reason was
for convenience. That and secondly, there just wasn't anything done at this wide of
a scale before. And when we look at stuff with passive monitoring, we miss a lot of
the information that I was interested in, such as the upper half of the Bluetooth addresses.
So what was I interested in? Mostly Bluetooth addresses. But along with this comes device
information,
such as the device name. Since I was doing all of this from mobile devices, geolocation
was really easy to add in there. And any other metadata that I get with a simple scan. So
this would be device class information, et cetera. I did nothing that was actually probing,
you know, Bluetooth ports or anything like that. Because that typically takes time and
by the time a Bluetooth object moves on, you wouldn't actually be able to complete the
scan. So what were the tools that I used for this project? Everything about this project
is open source.
I have the clients and I have a simple server too that you can implement if you don't want
to send all of your data up to my managed server. But the main client that I created
was an iOS client. It will not be in the App Store. I had to leverage a lot of private
libraries and APIs that Apple does not, you know, allow to go through, for things like making the application
run in the background 24/7. So they don't allow you to put that stuff there. But it's up on my GitHub
repo, so you can compile it and throw it on your phone really easily. For cross‑platform support,
if you saw Joseph Cohen's talk a few ‑‑ like about an hour ago, actually in order
to support cross‑platform I hacked up one of his Blue Cat clients so I have a fork of
it on my repo and basically the data that it collects will ship off to the remote server
too. The server, if you want to participate in this project but you don't want your data
to go out to the public, I have a simple server implementation on my GitHub repo too
so you can kind of just change the URL of the client and have it report to your server
too. All the data that I collect or any data that
goes to my server, completely open to people. I do a database dump about once a week so
you can't use the information to track someone. I'm kind of giving them a week head start.
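The clients described above report each sighting to a server, and the server URL can be swapped for your own. A minimal sketch of what building such a report could look like is below. The endpoint URL, the JSON field names, and the function name are all hypothetical, since the talk does not describe the actual wire format; the request is only constructed here, not sent.

```python
import json
import urllib.request
from datetime import datetime

# Hypothetical endpoint: point this at whichever server you run yourself.
SERVER_URL = "http://localhost:8000/sightings"


def build_sighting_request(address: str, name: str, lat: float, lon: float,
                           seen_at: datetime, url: str = SERVER_URL):
    """Build (but do not send) an HTTP POST describing one device sighting."""
    payload = {
        "address": address,              # full Bluetooth address from the scan
        "name": name,                    # device name, may be a generic label
        "lat": lat,                      # where the scanner was at the time
        "lon": lon,
        "seen_at": seen_at.isoformat(),  # timestamp of the sighting
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would actually transmit it; changing the url argument is the equivalent of repointing a client at a private server.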
So there are similar projects. There was nothing that kind of like hit this broad of
a scale. But NAPNAP is something that is similar. This was a project run by Josh Wright where
he was trying to
correlate the upper half of an address which is vendor‑specific to the actual device
that it correlates to. And so this was kind of a little bit different. It didn't really
follow what I wanted to look at. JP Dunning, he did ‑‑ a couple years ago he did
the Bluetooth profiling project. And it was similar to mine. It didn't have the capability
for shipping to a remote server like these clients do. It didn't have geolocation information.
It was mostly looking at unique devices, whereas repeated ‑‑
I had multiple use cases where repeated sightings were really important. Things like this. And, you know, Bluetooth
changes from year to year. His last scan was, you know, from like two years ago. And so,
you know, you wouldn't see things from the last year or two. This stuff kind of changes
really quickly. There are some closed source projects that kind of do this detection stuff.
So wireless works is a company. If you've been paying attention to the media lately,
there's been a big stink about malls and department stores
tracking you based on your cell phone's wireless probing, right? Wireless works is kind of
the Bluetooth version of this where they track you going into a store based on Bluetooth
information. There's not a lot of information about them online. So if anyone is familiar
with them, I wouldn't mind talking to you after the talk.
Houston's TranStar system, they use Bluetooth to ‑‑ even though I'm from Texas, I have
nothing to do with this project, but they use Bluetooth to detect traffic patterns in
Texas highways. So obviously these projects don't, you know, open the data sets up to
the public, so it wasn't much use to me. But it's interesting just to know that they're
out there. So the database right now currently has over
12,000 sightings and around 5,000 actual unique devices. This is kind of just like my collection
over time for the last couple months. As you can see, Vegas was pretty lucrative. It would
be the last couple days on the end of the time series.
And so one of the first questions I had, you know, what is the most popular discoverable
Bluetooth device out there? If you had seen Joseph Cohen's talk like an hour or so ago,
you might have a hint as to what it is. Anyone here want to take a guess?
No, no. It was BlackBerrys, actually. BlackBerrys win by a landslide. So this is broken out
based on device name, which there is a bit of error here. But I had to munge some of
the data. A lot of the devices
that we get back in have generic names. You don't actually get the name on the first scan,
so they would just be kind of bucketed into mobile phone. So for this data set, I just
truncated that information off. But for the most part, if you take the top ten in this
list, it's pretty accurate as far as what's really out there. You know, Apple products,
MacBooks, iPhones, Roku, just going down a list of the top five, Bluetooth TV sets,
DirecTV, iMacs, iPads. I'll kind of get into some of this information a little bit.
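A minimal sketch of the kind of name-based tally described above, assuming the sightings have already been reduced to a list of device names. The set of generic placeholder names and the function name are assumptions for illustration, not the actual cleanup used for the slide.

```python
from collections import Counter

# Assumed placeholder names that carry no real device information.
GENERIC_NAMES = {"", "mobile phone", "unknown"}


def top_device_names(names, n=10):
    """Count device names, dropping missing and generic ones, and return the top n."""
    counts = Counter(
        name.strip()
        for name in names
        if name is not None and name.strip().lower() not in GENERIC_NAMES
    )
    return counts.most_common(n)
```

Matching on exact names will still undercount devices whose owners renamed them, which is one source of the error mentioned above.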
So some of the cool things that we can do, I kind of truncated a lot of this material.
I thought this was a 20‑minute talk, but they gave me more, so I get to take my time
on this. But some of the cool things we can do with geolocation on this is the way that
I was collecting data was a lot different than a lot of other techniques, where most
techniques as far as like wireless works or the Texas Department of Transportation,
they're basically a stationary cell and basically you can assume that anything passing them
is a mobile device, right? So it's moving. So what we did was we took a lot of data from
this. With my survey, it was a little different. I was the moving device. So it's really hard
for me to determine am I seeing a moving device on the other side. And so in order to do this,
I would take ‑‑ I would have to see a device more than once, so I need two or more
sightings and I would take the two farthest geolocation points for that device in order
to correlate and bucket it into how much this device actually moves.
So for this, I'm getting about, you know, over 70% of the devices that I actually saw
were moving.
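A minimal sketch of the movement test described above: take every sighting location for a device, find the two points that are farthest apart, and bucket the device by that distance. The haversine formula is a standard great-circle approximation; the bucket edges, the 0.1 km "stationary" threshold (to absorb GPS jitter), and all function names are assumptions for illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def max_spread_km(points):
    """Largest distance between any two sightings, or None with fewer than two."""
    if len(points) < 2:
        return None
    return max(haversine_km(p, q) for p, q in combinations(points, 2))


def movement_bucket(points, edges_km=(0.1, 1.0, 5.0)):
    """Label a device by how far apart its two most distant sightings are."""
    spread = max_spread_km(points)
    if spread is None:
        return "unknown"          # a single sighting says nothing about movement
    if spread < edges_km[0]:
        return "stationary"       # within the assumed jitter threshold
    for edge in edges_km[1:]:
        if spread < edge:
            return f"moved < {edge} km"
    return f"moved >= {edges_km[-1]} km"
```

The pairwise comparison is quadratic in the number of sightings per device, which is fine when most devices are only seen a handful of times.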
And you can kind of bucket them out into how much distance I actually saw them moving.
So because I'm one person, it was, you know, it was rare for me to see things at the higher
end of the range, right? So you can see that I only had about 5% that I actually saw move more
than five kilometers at the end. Cool way to look at the data. And I liked it. Another
thing you can do that's pretty cool with geolocation information is you can look at the reoccurrence
of it in your data set.
So in this top picture here, this is a local Costco that I go to, you know, time after
time and I always have my phone on, you know, scanning the devices in there. So on this
map you can see the blue dots. And these are devices that are stationary and local to this
particular Costco that I go to. And all of the red dots would be, you know, devices which
are, you know, not local to this. So these would be most likely people, you know, traversing
the store at the time. So that's one way you can look at geolocation information with this
stuff, which was pretty neat.
The other way, I call it solving the small world phenomenon. Assume
that, you know, you live in Cleveland and your friend lives in L.A. and somehow you
meet in Denver and say, oh, wow, what a small world. So you can kind of do this with
Bluetooth information, too. So this bottom picture here is the route that I skateboard
to work every day and it's an access road, so there's not a lot of traffic that goes
by. But the blue dots denoted here in this image are cars that I pass multiple times,
whereas the red dots are cars that I would never, you know, see again.
And so, I mean, this is something that you would never really realize on your own.
You're not going to memorize every car that you pass every day.
So it's kind of a cool way of looking at the data set.
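A minimal sketch of the blue-dot versus red-dot split described above, assuming each sighting is an (address, timestamp) pair already filtered to one location or route. Treating "seen on at least two different days" as recurring is an assumption, as are the function and parameter names.

```python
from collections import defaultdict
from datetime import datetime


def split_by_recurrence(sightings, min_days=2):
    """Return (recurring, one_off) sets of addresses.

    sightings: iterable of (address, datetime) pairs for one place or route.
    A device is recurring if it was seen on at least min_days distinct dates.
    """
    days_seen = defaultdict(set)
    for address, seen_at in sightings:
        days_seen[address].add(seen_at.date())
    recurring = {addr for addr, days in days_seen.items() if len(days) >= min_days}
    one_off = set(days_seen) - recurring
    return recurring, one_off
```

Counting distinct days rather than raw sightings keeps a device that was scanned many times during a single visit from looking like a regular.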
And so, geolocation was cool.
It wasn't really why I got into this project, it was just something I could tack on in order
to kind of see the data in a different way.
I was mostly interested in the Bluetooth address space.
I threw this slide in here because I understand that not everyone is familiar with the Bluetooth
address space.
So this is my quick primer of it.
So in Bluetooth, addresses are laid out a lot like a MAC address and basically you have
the upper half being vendor specific and the lower half being device specific.
The device specific half, the LAP here, is supposed to be unique across devices.
And once you get into the vendor specific part, we kind of split it up into the NAP
and the UAP.
UAP is something that when we do passive monitoring techniques in Bluetooth, we don't
always get it.
And the NAP we never get.
And even though all of this data that I actually did the research on was for discoverable devices,
I wanted to take that data set and use it in my techniques for passive Bluetooth monitoring.
So I don't know if I mentioned, but basically when we do passive Bluetooth monitoring, the
LAP is the only thing we're guaranteed ever.
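A minimal sketch of the address layout in the primer above: a 48-bit Bluetooth address is the 16-bit NAP, then the 8-bit UAP, then the 24-bit LAP. The function name and the colon-separated input format are assumptions for illustration.

```python
def split_bd_addr(addr: str) -> dict:
    """Split an address such as 'AA:BB:CC:DD:EE:FF' into its NAP, UAP and LAP parts."""
    octets = addr.replace("-", ":").upper().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a 6-octet Bluetooth address: {addr!r}")
    int("".join(octets), 16)  # raises ValueError if any octet is not hexadecimal
    return {
        "nap": "".join(octets[0:2]),  # upper 16 bits, never recovered passively
        "uap": octets[2],             # next 8 bits, sometimes recovered passively
        "lap": "".join(octets[3:6]),  # lower 24 bits, always visible passively
    }
```

The NAP and UAP together form the vendor-specific upper half, which is why losing them in passive monitoring also loses the vendor.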
So one of the things that I wanted to talk about is the LAP.
One of the things that I really wanted to determine, if I were to go back and do this
whole survey again with only, you know, Bluetooth passive monitoring, I would most likely only
get LAP addresses for geolocation.
And so what I really wanted to know is are LAP addresses unique or are vendors just kind
of printing and pressing them out and, you know, reusing the same LAP over and over.
And lo and behold, it turns out that they're not reusing them.
LAPs are actually pretty evenly distributed.
So of all of the devices I saw, which is around 5,000, I only had one collision and that happened
at around 3,000 devices, which isn't too bad.
That's an acceptable loss for me if I'm out scanning a whole bunch of devices and I get
a collision every 3,000 or 4,000.
That's not too bad.
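A minimal sketch of the collision check described above, assuming a list of full colon-separated addresses from the discoverable survey. A collision here means two different full addresses that share the same LAP. The second function is a standard birthday-style estimate of how many colliding pairs you would expect if LAPs were drawn uniformly at random, which is an idealizing assumption; the function names are hypothetical.

```python
from collections import defaultdict

LAP_SPACE = 2 ** 24  # a LAP is 24 bits wide


def lap_collisions(addresses):
    """Map each LAP shared by two or more distinct full addresses to those addresses."""
    by_lap = defaultdict(set)
    for addr in addresses:
        octets = addr.upper().split(":")
        by_lap["".join(octets[3:])].add(":".join(octets))
    return {lap: sorted(full) for lap, full in by_lap.items() if len(full) > 1}


def expected_colliding_pairs(n_devices, space=LAP_SPACE):
    """Expected number of colliding pairs for n uniformly random values in the space."""
    return n_devices * (n_devices - 1) / (2 * space)
```

For roughly 5,000 devices the estimate comes out a little under one colliding pair, which is in line with the single collision reported above, under the uniformity assumption.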
This graph here really doesn't mean anything.
It just kind of looked cool.
It was basically from, you know, 00 to, you know, FF, like the whole 256 byte values that
you can get for all sections of the LAP, just how evenly it is distributed across.
You can see there's no hot spots and that kind of leads me to believe, too, that it
is pretty unique.
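A minimal sketch of how a per-byte distribution like the one on that graph could be computed, assuming each LAP is available as a 6-digit hex string. Whether the original graph was built exactly this way is not stated in the talk; the function name is hypothetical.

```python
from collections import Counter


def lap_byte_histograms(laps):
    """Return three Counters, one per LAP byte position, counting values 0x00 to 0xFF."""
    histograms = [Counter(), Counter(), Counter()]
    for lap in laps:
        raw = bytes.fromhex(lap)
        if len(raw) != 3:
            raise ValueError(f"a LAP is exactly 3 bytes: {lap!r}")
        for position, value in enumerate(raw):
            histograms[position][value] += 1
    return histograms
```

A value that vendors favored would show up as a spike in one of the three histograms; a flat shape is what the even distribution described above would look like.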
UAP, so the UAP is something that we do get in passive monitoring sometimes.
It depends on whether or not the traffic that goes over the wire has a payload.
So by looking at it in active devices across the board, we can kind of, you know, derive
some cool information about it.
It looks as if, you know, the whole address space is pretty much used for UAPs and there
is a hot spot for popular UAPs.
This can be used for, you know, mostly if you were to grab the LAP ‑‑ if you only
had an LAP and you wanted to just derive what the most probable UAP is, you could use
the top one.
I guess if you really wanted to, you could use it for brute-forcing UAPs, although
it's probably not the most effective thing.
But it was just interesting.
I saw that there was a hot spot with UAPs.
A lot of UAPs are used more than others.
And this is only the top 35 UAPs.
So basically the last, you know, 200 some of them, you know, you're getting down into
one device per UAP.
So it tails out pretty nicely.
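A minimal sketch of the UAP ranking described above, assuming a list of full colon-separated addresses. Each unique device is counted once, and the result is ordered from most to least common, so the first entry is the "most probable UAP" guess when only a LAP is known. The function name is hypothetical.

```python
from collections import Counter


def uap_ranking(addresses):
    """Return (uap, device_count) pairs, most common first, ties broken by UAP value."""
    unique_devices = {addr.upper() for addr in addresses}
    counts = Counter(addr.split(":")[2] for addr in unique_devices)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Walking this list in order is also the obvious candidate order for trying UAPs one at a time, though as noted above that is probably not the most effective approach.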
And so this was the coolest thing.
This is ‑‑ I don't know.
I'm a nerd.
This is my favorite part of all the research.
We do not get NAPs.
So what I really wanted to do was see if I could derive an NAP based on probability of
the rest of its address space.
So in order ‑‑ you know, you need an NAP if you want to correlate a device to a
vendor.
And since we don't get this in passive monitoring, what I was looking for here is are there higher
probabilities of getting particular NAPs based on a UAP.
And so this was pretty interesting.
Right here.
I have basically this graph is just the first eight addresses ‑‑ first eight UAP addresses
based out of the 256 possibilities from 00 to 07 just for this.
But you can see for every UAP I have correlating NAPs with that.
And so what we can basically do here is say if you have a UAP, what is your most probable
NAP based on devices seen out in the wild?
As it turns out, this is actually pretty good for coming up with a high probability of what
it is.
Worst case scenario, I think there's eight NAPs associated with one or two of the UAPs
on the list.
But a one in eight probability isn't too bad.
And then you're going to be able to see which ones were actually used the most.
So you can kind of narrow that down into the highest probability.
So that was pretty interesting.
And like I said, that's your worst case scenario.
Best case scenario, which happens to be for the majority of all UAPs, is there's only
one, two, three NAPs actually associated with those UAPs.
So for the majority of the time you can narrow it down to one or a few NAPs that are actually
associated with it and then kind of increase your probability based on how many times you
saw it.
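A minimal sketch of the NAP-given-UAP lookup described above, assuming a list of full colon-separated addresses seen in the wild. For each UAP it lists the NAPs observed with it and the fraction of unique devices behind each one. These are empirical frequencies from whatever data set is supplied, not guarantees; the function name is hypothetical.

```python
from collections import Counter, defaultdict


def nap_given_uap(addresses):
    """Map each UAP to a list of (nap, probability) pairs, most probable first."""
    counts = defaultdict(Counter)
    for addr in {a.upper() for a in addresses}:  # count each unique device once
        octets = addr.split(":")
        counts[octets[2]]["".join(octets[:2])] += 1

    table = {}
    for uap, naps in counts.items():
        total = sum(naps.values())
        ranked = sorted(naps.items(), key=lambda item: (-item[1], item[0]))
        table[uap] = [(nap, seen / total) for nap, seen in ranked]
    return table
```

Given a UAP recovered from passive monitoring, table[uap][0] is the most probable NAP, and the length of the list shows how many candidates remain.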
So this was interesting.
I think last year I kind of touched on this subject, but I was just looking at vendor
lists.
So my NAP probabilities were, you know, I would have 40 to 60 possible NAPs, which
was kind of completely not as useful, right?
If I can tell you I can give you three possible vendors, that's pretty good.
And so on to vendor statistics.
This stuff can be used for two different purposes, I think.
One is increasing the probabilities even more from the last slides that I just talked about.
So if you had two NAPs that were kind of tied, you could weight it based on the actual vendor,
which one is more popular.
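A minimal sketch of the tie-breaking idea just described, assuming you already have per-NAP sighting counts for one UAP, a lookup from NAP to vendor for that UAP, and a vendor popularity table. All three inputs and the function name are hypothetical; the talk does not specify how the weighting was actually done.

```python
def best_nap(candidates, vendor_of, vendor_popularity):
    """Pick the most likely NAP for a UAP.

    candidates:        {nap: number of devices seen with this NAP and the UAP}
    vendor_of:         {nap: vendor name for this NAP and UAP combination}
    vendor_popularity: {vendor name: number of devices seen from that vendor}
    Sighting count decides first; vendor popularity only breaks ties.
    """
    def score(nap):
        popularity = vendor_popularity.get(vendor_of.get(nap), 0)
        return (candidates[nap], popularity, nap)

    return max(candidates, key=score)
```

Including the NAP itself as the last element of the score only makes the result deterministic when both counts are equal.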
The other cool thing to just look at is, you know, what are the most popular vendors out
there for Bluetooth?
Apple kind of takes the cake.
You know, I said BlackBerry was the number one actual device, but Apple has more products.
So they're kind of taking the cake with this.
BlackBerry in second.
Samsung does a lot of embedded devices, so they're pretty far up there.
Roku was pretty interesting.
I saw a lot of Rokus during the scan, and so before I started doing a lot of the correlations,
I thought that Rokus might be the most popular device that I saw out there, but they're pretty
high up on the list.
Kind of interesting to know.
And Rokus, they transmit very far.
I was doing all this with my iPhone, and obviously I was getting Roku boxes just driving down
the road, which is crazy.
Which means, you know, I'm probably getting, you know, 50 to 70 feet based off of my iPhone.
So they're really loud.
Security.
So what does this mean for security?
You know, you can be tracked with Bluetooth.
It's something that not everyone knows.
I think that if it is something that is important to you, you know, you're probably best turning
Bluetooth off.
And on top of that, Bluetooth is a secure protocol itself, but there are vulnerabilities
that exist out there.
Right?
Usually based on software implementations, vendors who will create services that, you
know, accept connections without actual PIN authentication or with easy PIN authentication.
You know, typically you'll see that where you can connect with just 0000.
So it is out there.
It is something to be aware of.
And I think, too, like if you ever wanted to do research in this realm, this was the
list that I never had.
Right?
Right.
If you wanted to get the most bang for your buck, finding, you know, the most widespread
device, this is kind of the list that, you know, I wish that I could just go down and
go for.
So it would be interesting to just start from the top and start going down and doing Bluetooth
audits on a lot of these devices.
Awareness.
You know, a lot of Bluetooth devices don't really act how you think.
Or you might have Bluetooth in places where you don't know.
Right?
A lot of people are not aware that Bluetooth is on in their car and discoverable all the
time.
A lot of people aren't aware that sometimes when you start up your car, your Bluetooth
goes on into discoverable mode for 60 seconds or longer.
So this is just kind of, you know, this happens in other devices besides car audios.
But this is just kind of something that, you know, some people might not be aware of.
And it's something that you can't turn off most of the time.
And I did notice a lot of bugs.
If you notice in my device list broken out by actual devices out there, if you're aware
of how iOS devices work, your Bluetooth actually only goes into discoverable mode
when you go to your Bluetooth settings menu.
Yet I'm seeing so many iPads and iPhones in my scans that, you know, is it just a chance
that I'm, you know, walking by somebody when they're actually configuring the Bluetooth?
Most likely not.
So what happens, this actually happened to me multiple times when I was scanning this.
And I've never really scanned 24-7.
So it wasn't apparent to me at the time that my phone would get stuck in discoverable
mode sometimes.
It depends on how you actually leave the Bluetooth settings page in your iOS device.
If you leave too fast sometimes it just kind of gets perma stuck in discoverable mode.
So that's why you actually see iPhones and iPads in discoverable mode so much.
And I believe that the other reason why there's so many discoverable Bluetooth devices out
there is bad human computer interface, right?
Vendors just, you know, they give you that perma discoverable button when you really
don't need it.
I'm not picking on Apple here, but they did it right with iOS despite the bug that I mentioned.
But in OS X they don't, right?
They could have done the same thing, where when you go into the configuration page you're
in discoverable.
When you leave you're not.
But they have that perma button.
I believe that happens with BlackBerry, too.
Legal issues, you know, I don't ‑‑ you know, it's kind of the same as just scanning
Wi‑Fi devices.
It does seem to be a little more personal because it's something that belongs to you
more than, you know, your home Wi‑Fi router.
But as far as legal issues go, I mean, there's really nothing out there.
And it is kind of ‑‑ I don't know.
It's kind of just based ‑‑ the closest thing would be, you know, detecting Wi‑Fi
devices.
TranStar and wireless works say, you know, if you don't want to be tracked then it is
your responsibility as a consumer to turn off your Bluetooth device.
You know, I don't know.
Not everything can be turned off.
So I don't know if that's the right answer.
So all my data sets for this stuff can be downloaded from bluetoothdatabase.com.
As I said, I do a dump about once a week.
All the client code
and server code that I mentioned in this talk is linked from bluetoothdatabase.com.
Or you can go to my GitHub repo directly in order to get at this stuff.
And if you need to contact me, hate mail, whatever, Ryan at hacknar.com.
Future work.
I would like to ‑‑ currently on bluetoothdatabase.com, there's no real‑time statistics
except for device sightings over time.
So I'd like to take a lot of the slides that I kind of showed you today and just kind of
have, you know, like the week's most popular Bluetooth devices.
Or things like that.
Which would be pretty easy for me to add in there.
Community participation.
Obviously the reason why I'm here today.
If you guys want to participate in this, you know, feel free to install my clients.
Even if you don't want to submit it to this database.
Fire up your own server.
I supply the code for it.
Just kind of look at what's out and around you, right?
It's kind of fun to see.
And it will, you know, it will give you a wider, you know, idea of what's really going
on out there in the Bluetooth space.
Service enumeration.
I didn't add this in originally.
And I might play around with it.
I think if you ‑‑ I didn't want to do anything like too ‑‑ I wanted to be as
non-invasive as I possibly could.
I didn't want to do RFCOMM scans on a lot of these things.
Mostly because I don't want to interrupt anyone's daily process, right?
RFCOMM scanning can sometimes bring up pop‑ups on your devices and stuff like that.
So I really wasn't looking to offend anyone.
And a passive survey.
I think that by comparing ‑‑ by doing a large‑scale passive survey and kind of
comparing the data sets, it would lead to a lot of interesting space.
Like you would get a wider view of what the actual Bluetooth deployment is out there.
So you can kind of compare the sets.
You would see, okay, out of X amount of discoverable, how many passive do you typically see in an
area?
Things of this nature.
It would be cool to do with standard rate and Bluetooth low energy, right?
I don't believe anyone has really done
a large‑scale Bluetooth low energy scan, which would be kind of cool to see.
So that is it.
I almost clapped for myself.
